Bayesian Variable Selection in Clustering High-Dimensional Data

نویسندگان
چکیده

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Bayesian Variable Selection in Clustering High-Dimensional Data With Substructure

In this article we focus on clustering techniques recently proposed for highdimensional data that incorporate variable selection and extend them to the modeling of data with a known substructure, such as the structure imposed by an experimental design. Our method essentially approximates the within-group covariance by facilitating clustering without disrupting the groups defined by the experime...

متن کامل

Bayesian Variable Selection in Clustering High-Dimensional Data

Over the last decade, technological advances have generated an explosion of data with substantially smaller sample size relative to the number of covariates (p n). A common goal in the analysis of such data involves uncovering the group structure of the observations and identifying the discriminating variables. In this article we propose a methodology for addressing these problems simultaneousl...

متن کامل

High-Dimensional Variable Selection for Survival Data

The minimal depth of a maximal subtree is a dimensionless order statistic measuring the predictiveness of a variable in a survival tree. We derive the distribution of the minimal depth and use it for high-dimensional variable selection using random survival forests. In big p and small n problems (where p is the dimension and n is the sample size), the distribution of the minimal depth reveals a...

متن کامل

Attribute Selection for High Dimensional Data Clustering

We present a new method to select an attribute subset (with few or no loss of information) for high dimensional data clustering. Most of existing clustering algorithms loose some of their efficiency in high dimensional data sets. One possible solution is to use only a subset of the whole set of dimensions. But the number of possible dimension subsets is too large to be fully parsed. We use a he...

متن کامل

High-Dimensional Bayesian Clustering with Variable Selection: The R Package bclust

The R package bclust is useful for clustering high-dimensional continuous data. The package uses a parametric spike-and-slab Bayesian model to downweight the effect of noise variables and to quantify the importance of each variable in agglomerative clustering. We take advantage of the existence of closed-form marginal distributions to estimate the model hyper-parameters using empirical Bayes, t...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: Journal of the American Statistical Association

سال: 2005

ISSN: 0162-1459,1537-274X

DOI: 10.1198/016214504000001565